Network path discovery and analysis

ABSTRACT

A network analysis system invokes an application specific, or source-destination specific, path discovery process. The application specific path discovery process determines the path(s) used by the application, collects performance data from the nodes along the path, and communicates this performance data to the network analysis system for subsequent performance analysis. The system may also maintain a database of prior network configurations to facilitate the identification of nodes that are off the path that may affect the current performance of the application. The system may also be specifically controlled so as to identify the path between any pair of specified nodes, and to optionally collect performance data associated with the path.

This application claims the benefit of U.S. Provisional Patent Applications 61/249,287, filed 7 Oct. 2009, and 61/374,064, filed 16 Aug. 2010.

BACKGROUND AND SUMMARY OF THE INVENTION

This invention relates to the field of network analysis, and in particular to a system and method that facilitate the discovery of nodes along the path of a network application, or between any two identified nodes, and the subsequent collection and analysis of performance parameters associated with these nodes and related nodes.

The ever increasing use of applications that communicate across a network has led to changes in the conventional ‘network management’ role. A network manager is generally concerned with the overall health/performance of the network. However, each user of an application is affected by the performance of the particular application on a network, and is relatively uninterested in overall performance measures when the particular application exhibits poor performance on the overall-healthy network. Accordingly, the network manager must be sensitive to application-specific performance problems.

If a user of the application reports a problem, such as long delay times, a network manager generally needs to analyze the performance of the application server node as well as each node in the network along the path between the user and the application server. In like manner, determining the path between two identified nodes will also facilitate preventive maintenance tasks, security analysis, planning tasks, or any other task that requires path identification.

Identifying the nodes along a path is typically a two-step process. Using the OSI network model, a path can be defined by the network layer nodes, or layer-3 nodes, and a more detailed path can be defined by the data-link layer devices, or layer-2 devices. Network layer nodes generally provide dynamic switching, based for example on the current contents of a routing table at the network layer node. Typically, the network layer path between the two nodes is found, then the data link layer devices that lie along the determined path are identified.

There are two common techniques used to determine the network layer path between a source node and a destination node: an ‘active’ technique that includes sending trace messages from the source node to the destination, and a ‘passive’ technique that includes sequentially investigating the configuration of the routers to determine the ‘next hop’ toward the destination.

U.S. Pat. No. 7,742,426, “SYSTEM, METHOD, AND COMPUTER-READABLE MEDIUM FOR DETERMINING A LAYER-2 TRACE IN A HETEROGENEOUS NETWORK SYSTEM”, issued 22 Jun. 2010 to Schumacher et al., discloses using a trace request to identify the network layer path between a pair of nodes, then finding the layer-2 devices along each of the identified links forming the path, and is incorporated by reference herein.

U.S. Pat. No. 7,293,106, “METHOD OF FINDING A PATH BETWEEN TWO NODES IN A NETWORK”, issued 6 Nov. 2007 to Natarajan et al. and incorporated by reference herein, discloses sequentially identifying each next hop based on routing tables, and identifying the data link layer devices along the hop based on a network topology database.

U.S. Pat. No. 7,760,735, “METHOD AND SYSTEM FOR DISCOVERING NETWORK PATHS”, issued 20 Jul. 2010 to Chen et al. and incorporated by reference herein, discloses querying network devices for their current configuration, including routing tables, and sequentially proceeding along the path identified by the next-hop information, using interface definitions at each device to identify the data link layer devices along the path.

While these prior art systems are effective for finding devices along a path between two nodes, they each rely on having access to certain features or capabilities that may or may not be available to a particular network manager. For example, Schumacher relies on having access to the source node in order to send the trace request to the destination node. Often, the network is provided by a third-party provider, and the user at the source node may be reluctant to allow this third-party to access the node. In like manner, Natarajan relies on the fact that the data link layer topology of the network is known. In many cases, the path between two nodes may extend across ‘foreign’ networks, such as public networks, for which topological information is not available. Similarly, Chen relies on being able to query each device along the path of next-hops, presuming that all of the network devices are freely accessible. If a device cannot be queried directly for the next hop, but responds to SNMP requests, the entire routing table would need to be downloaded and processed to determine the next hop. In a large, complex network, the routing tables can be quite large, and it may not be feasible to download all of the routing tables for devices that cannot be queried directly for the next hop.

The analysis of an application specific problem is often compounded if the cause of the problem is a node that is not in the application path but impacts a node that is in the path. For example, most networks are fault-tolerant, such that when a node on a path fails, the path is automatically altered to avoid the failed node. If the alternative path inherently has poorer performance than the original path, the user will typically report a degradation in the application's performance. However, an assessment of the nodes along this new path will not identify the problem, because each node on the new path will be working properly.

It would be advantageous to integrate the variety of techniques used in the path discovery process. It would also be advantageous to automate the use of alternative techniques during the path discovery process. It would also be advantageous to identify nodes that are not on the path that may be impacting the nodes on the path.

These advantages, and others, may be realized by a network analysis system that automatically invokes different path discovery techniques, based on the conditions found as the path discovery process proceeds. The path discovery process determines the path(s) used by the application, collects performance data from the nodes along the path, and communicates this performance data to the problem reporting system for subsequent performance analysis. The system may also maintain a database of prior network configurations to facilitate the identification of nodes that are off the path that may affect the current performance of the application.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is explained in further detail, and by way of example, with reference to the accompanying drawings wherein:

FIG. 1 illustrates an example network of elements on a path between a source and destination node, and an example monitoring system for determining the path and collecting data associated with the elements on the path.

FIG. 2 illustrates an example flow diagram for finding the path between a source and destination node.

FIG. 3 illustrates an example flow diagram for an alternative technique for finding the path between a source and destination node.

FIG. 4 illustrates an example flow diagram for collecting performance metrics associated with a path between a source and destination node.

FIG. 5 illustrates an example flow diagram of the use of the monitoring system for planning purposes.

FIG. 6 illustrates an example flow diagram of the use of the monitoring system to identify anomalous performance.

FIG. 7 illustrates an example flow diagram of the use of the monitoring system to respond to alarms.

FIG. 8 illustrates an example block diagram of a monitoring system.

Throughout the drawings, the same reference numerals indicate similar or corresponding features or functions. The drawings are included for illustrative purposes and are not intended to limit the scope of the invention.

DETAILED DESCRIPTION

In the following description, for purposes of explanation rather than limitation, specific details are set forth such as the particular architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the concepts of the invention. However, it will be apparent to those skilled in the art that the present invention may be practiced in other embodiments, which depart from these specific details. In like manner, the text of this description is directed to the example embodiments as illustrated in the Figures, and is not intended to limit the claimed invention beyond the limits expressly included in the claims. For purposes of simplicity and clarity, detailed descriptions of well-known devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.

FIG. 1 illustrates an example subset of a network, the subset comprising a source node S, destination node D, monitoring node M, elements E1-E4 along a path between the source S and destination D nodes, elements E6-E7 between the monitoring node M and element E1, and elements E8-E9 between the elements E6 and E4. The network is assumed to have other elements coupled to the illustrated nodes and elements, and the illustrated paths will not be immediately apparent among the links coupling all of these other elements. The terms ‘node’ and ‘element’ are used herein to facilitate a distinction between “end-nodes” (or “nodes of interest”) and “intermediate nodes” (or “data-passing” nodes), although the terms are synonymous in the art and in the context of this disclosure. That is, each ‘element’ will generally be a node of a network, and each ‘node’ may be an element in a path between a different set of source and destination nodes.

The example elements E1-E9 will generally be routing elements that are dynamically configured to establish the paths for communications between the nodes. This dynamic configuration may include configurations that are manually defined by a network administrator, automatically defined based on metrics associated with the available links between nodes, or a combination of these and/or any other routing technique common in the art. In standard OSI terminology, these elements E1-E9 are typically elements at the network layer, or layer-3 elements. Other devices will generally exist between these network layer elements, at the data-link layer (layer-2), but these devices will generally have static configurations that do not directly affect the routing of a message between nodes.

Also illustrated in FIG. 1 is a controller C that controls the monitor node M in response to alerts or other events associated with the network. In a typical scenario, the controller C will be alerted to a potential problem associated with communications between a given set of source S and destination D nodes. As noted above, this alert may be in the form of a user complaint that a particular application that communicates between the source and destination is experiencing a performance degradation. Alternatively, the alert may be automatically generated by an application when communications between a source-destination pair drop below some performance threshold. One of skill in the art will recognize that other techniques of alerting a controller may be used, the source of the alert being immaterial to the principles of this invention.

Upon receipt of an alert that a given source-destination pair may be experiencing problems, or any other request for information regarding communications between the source-destination pair, the controller directs the monitor node M to determine the current path between these nodes. The technique(s) used to determine the current path between the source S and destination D will generally depend upon the level of communication control that the monitor node M can achieve with the given nodes and the elements between them.

FIG. 2 illustrates an example flow diagram for determining the path between a source node S and a destination node D, depending upon the level of communication control achievable by the monitor node M.

At 210, the monitor node M may attempt to control the source node to initiate a communication with the destination node. In an example embodiment, the controller C may send access information to the monitor node M that allows the monitor node M to remotely access the source node S to initiate this communication with D. This access information may have been stored in a database associated with the controller C, or it may have been provided in the alert that was communicated to the controller C.

If the monitor node M is able to initiate communications from the source node S, the monitor node M may send a “traceroute” or similar command to the destination node D, at 240. As noted above, utilities are often provided for generating traces as a message traverses a network. A ‘traceroute’ or ‘tracepath’ message, for example, is configured to cause each element that receives the message to respond to the sender, even if that element is not the intended recipient of the message. The response may be a notification of receiving the message, or a notification of forwarding the message, or combinations of such or similar notifications. In this manner, each element that passes the message to a next element along the path to the intended recipient (destination) notifies the sender that it received the message. The utility programs that cause these traceroute messages to be transmitted are configured to receive the responses, and to calculate one or more metrics associated with the transmission and reception, such as the duration of time between the transmission and reception.

For ease of reference and understanding, the term ‘trace request’ is used herein in the general sense meaning any communication utility that sends one or more messages that cause intermediate elements along the path of a network to send a notification that enables the utility to trace the path of the message(s). Also for ease of reference, unless otherwise noted, the ‘path’ discussed below is the network layer path (layer-3 path).

If, at 250, the trace request utility successfully identifies the path from the source S to the destination D, this path is returned, at 290. However, in some cases, the trace request utility will fail to determine the complete path to the destination D, and will report only a portion of the path from the source S toward the destination D.

In the event, at 250, that the complete path is not discovered by the trace request, the monitor node M may attempt to discover the remainder of the path by investigating the configuration of the last element that the trace request was able to identify, at 260. For example, with reference to FIG. 1, if the trace request identifies the portion of the path from source S to destination D as S-E1-E2-E3, and is unable to determine that element E4 is the next element along the path, the monitor node M may access a routing table at element E3 to determine that the ‘next hop’ for a message addressed to destination D will be element E4. Thereafter, the monitor node M may revert to sending a trace request from element E4 to determine that the destination D is within one hop of element E4, or the monitor node M may continue to determine the path by accessing the routing table, or other configuration information, at element E4. At 270, the trace request determined portion S-E1-E2-E3 is concatenated with the configuration determined path E3-E4-D, and the complete path S-E1-E2-E3-E4-D is returned, at 290.
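The alternation between trace requests and configuration lookups in steps 240-290 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the claimed implementation: the names discover_path, trace, and next_hop are hypothetical, and the two callables stand in for whatever trace utility and routing-table lookup are actually available to the monitor node.

```python
from typing import Callable, List, Optional


def discover_path(
    source: str,
    dest: str,
    trace: Callable[[str, str], List[str]],
    next_hop: Callable[[str, str], Optional[str]],
    max_hops: int = 32,
) -> List[str]:
    """Build a layer-3 path by mixing trace requests with configuration lookups.

    trace(start, dest) is assumed to return the elements it identified,
    beginning with `start`; the list may stop short of `dest`.
    next_hop(element, dest) is assumed to return the next element toward
    `dest` according to that element's configuration, or None if unknown.
    """
    path = [source]
    while path[-1] != dest and len(path) <= max_hops:
        # Steps 240/250: attempt a trace request from the last known element.
        segment = trace(path[-1], dest)
        if len(segment) > 1:
            # Step 270: concatenate, dropping the element both portions share.
            path.extend(segment[1:])
            continue
        # Step 260: the trace made no progress, so consult configuration data.
        hop = next_hop(path[-1], dest)
        if hop is None:
            break  # the path stays truncated; other methods would be needed
        path.append(hop)
    return path


# Hypothetical data mirroring the FIG. 1 example: the trace stalls at E3.
traces = {"S": ["S", "E1", "E2", "E3"], "E3": ["E3"], "E4": ["E4", "D"]}
routes = {"E3": "E4"}
path = discover_path(
    "S", "D",
    lambda start, dest: traces.get(start, [start]),
    lambda element, dest: routes.get(element),
)
print("-".join(path))  # prints S-E1-E2-E3-E4-D
```

The max_hops bound is an added safeguard against routing loops and is not part of the described flow.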

If, at 210, the monitor node M is not able to initiate communications from the source node S, the monitor node M communicates a trace request to the source node S, at 220, to find a path from the monitor node M to the source node S. The monitor node M then searches for a controllable node that is nearest to the source node, at 225. For example, based on the trace request result (M-E6-E7-E1-S), the monitor node determines that element E1 is the closest identified element to the source node S. If communications from element E1 can be initiated by the monitor node M, the monitor node M initiates a trace request from element E1 to the destination node D, at 230. The resultant path from element E1 to destination node D is concatenated with the initially determined path from the controllable element E1 to the source node S, at 235.

If communications from element E1 are not controllable by the monitor node M, the trace request to destination D is initiated from the next-closest element (E7) at 230. In this case, the path from element E7 to destination D will likely include element E1, or the trace request can be configured to force the trace request to go through element E1. Knowing that element E1 is connected to the source node S, the concatenation of paths M-E6-E7-E1-S with the path E7-E1-E2-E3-E4-D, at 235, will result in the exclusion of the portion M-E6-E7 as being immaterial to the path S-E1-E2-E3-E4-D from the source node S to the destination node D.
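The concatenation at 235 can be illustrated with a short sketch. The function name splice_paths is hypothetical, and the sketch assumes that the element adjacent to the source on the monitor's trace is also the first element on the path from the source toward the destination.

```python
from typing import List


def splice_paths(monitor_to_source: List[str], start_to_dest: List[str]) -> List[str]:
    """Combine a monitor-to-source trace with a trace toward the destination.

    monitor_to_source: for example ["M", "E6", "E7", "E1", "S"].
    start_to_dest: a trace from a controllable element on that route,
    for example ["E7", "E1", "E2", "E3", "E4", "D"].
    """
    source = monitor_to_source[-1]
    # The element adjacent to the source on the monitor's trace (E1 in FIG. 1).
    attach = monitor_to_source[-2]
    if attach not in start_to_dest:
        raise ValueError("trace toward the destination does not pass through " + attach)
    # Everything before the attachment element (e.g. M-E6-E7) is immaterial
    # to the source-to-destination path and is dropped.
    return [source] + start_to_dest[start_to_dest.index(attach):]


to_source = ["M", "E6", "E7", "E1", "S"]
from_e7 = splice_paths(to_source, ["E7", "E1", "E2", "E3", "E4", "D"])
from_e1 = splice_paths(to_source, ["E1", "E2", "E3", "E4", "D"])
print("-".join(from_e7))  # prints S-E1-E2-E3-E4-D
```

Both starting points, E7 and E1, yield the same spliced path in this example.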

As detailed above, if, at 250, the path determined at 235 is not a complete path to destination node D, the configuration of the last determined element along the path is assessed to determine the next element along the path to destination node D, at 260.

Optionally, the control element C may also direct the monitor node M to determine the path from the destination D to the source S, using the techniques detailed above with the roles of source and destination reversed. Alternatively, the control element may assume that the determined path is symmetric in direction, and messages from destination D to source S will travel along the same path.

FIG. 3 illustrates an example flow diagram for using configuration data to discover a next element along a truncated path determination, such as might be used in 260 of FIG. 2. For ease of reference and understanding, it is assumed that the last element determined by the trace request is a router, although it could be any of a variety of element types, typically OSI network layer elements.

At 310, the last element identified along the truncated path is identified, and its routing table is accessed at 320. This routing table may be the actual routing table on the router, or it may be a routing table stored within a model of the network, such as a simulation model. Based on the address of the destination node D, at 330, the routing table is assessed to determine where messages addressed to that address will be routed, typically identified as a ‘next hop’ element.

There are alternative techniques available for accessing the routing table, each having advantages and disadvantages. An SNMP request for the current routing table at a device will return the entire routing table, which can be stored for subsequent use. This table may then be searched to determine where messages addressed to the destination will be next sent. Alternatively, a device or vendor specific command can be sent to the router, such as “show ip route <dest>”, requesting that the router specifically identify the next hop on the path to the identified destination (<dest>).
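Searching a stored routing table for the next hop can be sketched as below. The description does not specify the search rule; this sketch assumes conventional longest-prefix matching, and the function name, the table layout (a list of prefix and next-hop pairs), and the addresses are hypothetical.

```python
import ipaddress
from typing import List, Optional, Tuple


def next_hop_from_table(
    routing_table: List[Tuple[str, str]], dest_ip: str
) -> Optional[str]:
    """Return the next hop for dest_ip using longest-prefix matching.

    routing_table is assumed to hold (prefix, next_hop) pairs such as
    ("192.168.5.0/24", "10.0.0.3"), as might be extracted from a table
    previously downloaded from the device.
    """
    dest = ipaddress.ip_address(dest_ip)
    best_len = -1
    best_hop = None
    for prefix, hop in routing_table:
        network = ipaddress.ip_network(prefix)
        # Keep the matching route with the most specific (longest) prefix.
        if dest in network and network.prefixlen > best_len:
            best_len = network.prefixlen
            best_hop = hop
    return best_hop


# Hypothetical stored table for one router.
table = [
    ("0.0.0.0/0", "10.0.0.1"),       # default route
    ("192.168.0.0/16", "10.0.0.2"),
    ("192.168.5.0/24", "10.0.0.3"),
]
print(next_hop_from_table(table, "192.168.5.7"))  # prints 10.0.0.3
```

Because the stored table is searched locally, repeated lookups for different destinations do not require further requests to the device.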

The direct request to the router may be the most efficient for specific requests, but different commands may be required for different devices, and secure access to the device is generally required. The SNMP request does not require a secure access, and stored tables can be quickly accessed, but the amount of data in an actively used router may be very large, and receiving and processing the data can be time and resource demanding. In an embodiment of this invention, the system is designed to selectively choose whether to perform a direct request or an SNMP request, based on the aforementioned consideration; for example, an SNMP request may be sent only if the device cannot be directly accessed, or if the size of the routing table is expected to be below a given threshold.
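One possible reading of this selection rule is sketched below. The function name, the parameter names, and the default threshold of 5000 entries are assumptions made for illustration; the description leaves the threshold and the exact policy open.

```python
from typing import Optional


def choose_lookup_method(
    direct_access: bool,
    expected_entries: Optional[int],
    max_entries: int = 5000,
) -> str:
    """Pick "snmp" or "direct" for a next-hop lookup.

    An SNMP table download is chosen when the device cannot be accessed
    directly, or when the routing table is expected to be small enough
    (below max_entries) that downloading and storing it is cheap.
    expected_entries may be None when no size estimate is available.
    """
    small_table = expected_entries is not None and expected_entries < max_entries
    if not direct_access or small_table:
        return "snmp"
    return "direct"


print(choose_lookup_method(direct_access=True, expected_entries=250000))  # prints direct
```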

If, at 340, the next hop element has been found, that element is added as the next element along the determined path, at 350, and the path determined thus far is returned. As noted above, having determined the next element along the path, the process of FIG. 2 may merely continue to execute the flow of FIG. 3 until the complete path is found, or it may revert to use of the trace request to determine the path from this next element to the destination node D, including, as required, a return to the flow of FIG. 3 each time the trace request fails to identify a next element along the path.

If, at 340, the next hop element has not been found based on the contents of the routing table, or other configuration data, one or more alternative methods may be used to determine the missing elements along the path, at 360, before the process completes, at 370. For example, as noted above, the monitor node M can apply these path determination techniques to attempt to find a path from the destination node D to the source node S. If a next hop cannot be found along the path from the source S to the destination D, a reverse path from the destination D toward the source S may identify these missing elements, assuming a symmetry of paths to and from the source S and destination D. That is, for example, if the path from the source S to the destination D is truncated at element E3 (S-E1-E2-E3), the path from destination D to source S may reveal a partial path D-E4-E3, from which it can be assumed that messages at element E3 being sent to destination D will likely be sent from element E3 to element E4, and from element E4 to destination D.
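The reverse-path inference can be sketched as follows, under the symmetry assumption stated above. The function name complete_from_reverse is hypothetical.

```python
from typing import List, Optional


def complete_from_reverse(forward: List[str], reverse: List[str]) -> Optional[List[str]]:
    """Fill in a truncated forward path from a (partial) reverse path.

    forward: truncated path from the source, e.g. ["S", "E1", "E2", "E3"].
    reverse: path discovered from the destination toward the source,
    e.g. ["D", "E4", "E3"]. Paths are assumed symmetric in direction.
    Returns None when the reverse path never reaches the truncation point.
    """
    last = forward[-1]
    if last not in reverse:
        return None
    # Elements seen before reaching `last` on the reverse path, flipped so
    # that they run from `last` toward the destination.
    tail = list(reversed(reverse[: reverse.index(last)]))
    return forward + tail


completed = complete_from_reverse(["S", "E1", "E2", "E3"], ["D", "E4", "E3"])
print("-".join(completed))  # prints S-E1-E2-E3-E4-D
```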

Another alternative method, if the next hop element has not been found, at 340, is to generate “tracing traffic” along the currently determined path to the destination D. This tracing traffic includes one or more identifiers, either in the header information or the body content of the messages, that identify this traffic as tracing traffic. In such an embodiment, the network includes network collection nodes, such as ‘sniffers’, that monitor traffic flowing along the link(s) in the network, and the network analysis system is configured to identify which of these collection nodes monitored the presence of this tracing traffic. In this manner, if there are ‘gaps’ in the path determination, the system can at least determine the ‘tail end’ of the path as the tracing messages are detected along the path to the destination.

It is significant to note that the path determination need not include each and every device along the path. While the network analysis may be more thorough or complete with more detailed information, determining that the tracing traffic was transmitted from one node and ‘eventually’ arrived at a particular network collection node will often provide a sufficient level of detail with regard to the path, because any intermediate nodes between the transmitting node and the collection node will generally be beyond the control of the network manager who is unable to discover these intermediate nodes.

FIG. 4 illustrates an example flow diagram for collecting performance metrics associated with a path between a source and destination node, as may be embodied on a controller C of FIG. 1. At 410, the controller identifies one or more pairs of end-nodes of interest, and at 420, the controller instructs the monitor to determine the path between these end-nodes, as detailed above.

The path determined by the monitor will generally identify the IP addresses of each of the network layer elements along the path. At 430, the devices corresponding to these addresses are identified, typically by the monitor. The monitor may attempt to map these IP addresses to network objects in its database and/or it may trigger automated discovery processes and metric polling for the IP address. The discovery process may access the configuration of the element identified by the address, and may generally provide more detailed information about the path, including, for example, identification of data link layer devices coupled to, or embodied with, the identified network layer element. Conventional neighbor discovery protocols, such as CDP (Cisco Discovery Protocol) and other link inference techniques may be used to determine the data link layer elements that provide the connectivity between the network layer elements. Optionally, the discovery process may also include identifying other network layer elements that are reachable at an equivalent ‘cost’ as the network layer elements in the identified path, and their performance metrics.

At 440, the controller instructs the monitor to collect performance and other metrics associated with each of the identified network objects. As is known in the art, most trace request utilities return one or more metrics, such as the response time, associated with each identified hop along the path; the monitor may also be configured to determine such metrics as interface utilization, interface errors, device CPU utilization, SLA data, and so on. Preferably, these metrics provide device level, hop level, and path/sub-path level information that can be used in any combination to understand the state of the network at any given time.

At 450, the current path, or paths, may be stored along with one or more of the metrics associated with each path, and at 460, the user is provided the opportunity to interact with the presentation of the determined path(s) and metrics, and perform subsequent analyses. Example analyses are detailed below.

FIG. 5 illustrates an example flow diagram for an analysis conducted to facilitate network planning and configuration, such as determining alternative paths for connecting the identified end-nodes (e.g. source and destination nodes).

At 510, the end-nodes are identified, and at 520, the current path and metrics associated with these nodes are determined, using the techniques detailed above. At 530, the user is provided the opportunity to define one or more alternative paths between these nodes. The alternative paths may be manually defined, or determined using an automated routing process, such as a determination of routes via simulations of a model of the network using differing routing criteria, or simulations of different models (e.g. actual and proposed network configurations).

At 540, the differences between the different paths and their performance metrics are identified. Thereafter, at 550, the user can view the alternatives, try different paths, conduct particular simulations, and so on, to determine whether an alternative path should be implemented or assigned for communications between the identified end-nodes.

FIG. 6 illustrates an example flow diagram for an analysis conducted to isolate and diagnose anomalous performance of communications associated with an application.

At 610, an application anomaly is detected, for example by an auditing system that monitors performance of an application that accesses a particular server. At 620, the users that are affected by the detected anomaly are identified. That is, in the example of a particular server, the server is one of each pair of end-nodes, and the nodes that are being affected by the anomaly are each the other end-node of the pair.

At 630, the paths being affected by the anomaly are determined, based on the identification of the pairs of end-nodes being affected. Because a ‘pattern’ is typically more easily recognized by a human, the system is configured to showcase (for example, by graphically presenting and highlighting) the affected paths, at 640.

The system may also facilitate the diagnostic process by presenting and/or highlighting any anomalous metrics, at 650. That is, for example, if a particular link along the path exceeds a particular delay duration, that link may be distinguished from among the other links along the path, using color, format, or other visual cue. Thereafter, at 660, the system allows the user to interact with the presented material to obtain more detailed information, select alternative paths, and so on.
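Selecting the links to highlight at 650 can be as simple as a threshold test, sketched below. The function name, the delay values, and the 50 ms limit are hypothetical; how the selected links are rendered (color, format, or other cue) is left to the presentation layer.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def anomalous_links(delays_ms: Dict[Link, float], max_delay_ms: float) -> List[Link]:
    """Return the links whose measured delay exceeds max_delay_ms.

    delays_ms maps a link, given as a pair of adjacent elements on the
    path, to its measured delay in milliseconds.
    """
    return [link for link, delay in delays_ms.items() if delay > max_delay_ms]


# Hypothetical per-link delays along the path S-E1-E2-E3.
delays = {("S", "E1"): 2.0, ("E1", "E2"): 180.0, ("E2", "E3"): 5.5}
flagged = anomalous_links(delays, max_delay_ms=50.0)
print(flagged)  # prints [('E1', 'E2')]
```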

FIG. 7 illustrates an example flow diagram for an analysis that may occur upon receipt of an alarm, at 710, such as a notification by a client that an application has been exhibiting a degradation in performance. Often, a change in network performance occurs when a change is introduced, either purposely or as a result of a non-purposeful event, such as a failure of a device. When a network change is introduced, the effects of the change will often affect multiple paths. For example, a failure of a device will generally cause each of the paths that had used the device to be automatically rerouted to alternative paths. Given that the original, non-fault, paths were likely the optimal paths for each user of the server, the change will likely result in a degradation in performance for all of the end-nodes that had used the failed device in their routing.

The loop 720-750 processes each of the received alarms. At 730, the path for each alarm (each reported affected user) is determined, and metrics associated with this path are collected, at 740. At 760, one or more prior sets of paths and their associated metrics are recalled from a storage medium, and at 770, the differences between the current paths and the prior paths are identified, as well as the differences between their corresponding metrics.

At 780, one or more network diagrams are presented to the user, based on the current and prior paths. Of particular note, because the prior path(s) are included in the presentation, devices that are not currently on any current path but had been on the prior path (e.g. failed devices) are included in the presentation, thereby facilitating a diagnostic of the ‘cause’ of the alarm(s). Preferably, such potential causes are visually distinguished in the presentation of the network diagram.
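Identifying the devices that appear on a prior path but on no current path reduces to a set difference, as in the sketch below. The function name and the example paths are hypothetical; in the example, element E2 is imagined to have failed, so both affected users are rerouted through E8 and E9.

```python
from typing import Iterable, List


def off_path_devices(
    prior_paths: Iterable[List[str]], current_paths: Iterable[List[str]]
) -> List[str]:
    """Return devices found on some prior path but on no current path.

    Such devices (for example, failed devices) are candidates for visual
    distinction in the network diagram as potential causes of the alarms.
    """
    prior = {device for path in prior_paths for device in path}
    current = {device for path in current_paths for device in path}
    return sorted(prior - current)


prior = [["S1", "E1", "E2", "E3", "D"], ["S2", "E1", "E2", "E3", "D"]]
current = [["S1", "E1", "E8", "E9", "E3", "D"], ["S2", "E1", "E8", "E9", "E3", "D"]]
suspects = off_path_devices(prior, current)
print(suspects)  # prints ['E2']
```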

At 790, one or more reports are produced based on a comparison of the current and prior paths. As in the example analyses of FIGS. 5 and 6, the user is also provided the opportunity to interact with the presented network diagram and other performance reports related to the network diagram, to develop further details and/or to assess alternatives.

FIG. 8 illustrates an example block diagram of a network analysis system 800 for determining performance parameters with regard to a source node S and a destination node D on a network 810. These nodes S, D communicate via a communications path that includes one or more intermediate elements E. These elements E may include routers that dynamically determine the path between the source S and destination D using, for example, routing tables that indicate a next hop (next element E) associated with an address corresponding to destination D. The address corresponding to destination D may be the actual IP address of destination D, a translation of the IP address of destination D, a virtual IP address of destination D, and so on.

A controller 850 of the network analysis system 800 communicates with a monitor M that is also on the network 810, so as to be able to communicate with at least some of the elements E. The controller 850 also communicates with performance analysis tools 820 and routing tools 880. The controller 850 interacts with a user of the system 800 via a user interface 860. Although the components of this system are illustrated as individual blocks, for ease of understanding, one of skill in the art will recognize that the functional partitioning may be changed, depending upon the particular embodiment of this invention. For example, the monitor M is shown separate from the controller 850 and other components in the system 800, although in many embodiments the monitor M and the controller 850 may be one integrated component in the system 800. In an alternative embodiment, the monitor M may, in fact, be multiple monitors that are distributed about the network 810. In such an embodiment, some of the below described monitoring functions may be performed by the controller 850, leaving the monitors to perform a limited set of these monitoring functions.

In a typical scenario, the user will request, via the interface 860, network performance information regarding the source S and destination D. Alternatively, the controller 850 may receive an automated request, based, for example, on a performance parameter exceeding a given threshold. In response, the controller will attempt to determine the path between source S and destination D, at least with respect to the network layer (layer-3) path, and perhaps additional detail, such as the data link layer (layer-2) devices that embody some or all segments of the network layer path.

As detailed above, depending upon the degree of control that the monitor has over the source S, the elements E, and the destination D, this path determination may include a mix of various path determining techniques, including, for example, determining portions of the path between the source and destination nodes using trace requests, determining other portions of the path based on configuration information associated with one or more elements along the path, and combining these determined portions of the path. This process may be repeated, along with determining other portions of the path by other means, as detailed above, until a network layer path between the source S and destination D is determined. The controller 850 may also direct the monitor M to attempt to collect other information, such as identification of data link layer devices that form the determined portions of the path.
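The repeated combination of trace-derived and configuration-derived portions described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the disclosed implementation: `trace` and `config_next_hop` are hypothetical callables supplied by the caller, the `max_hops` guard is an added safety limit, and the node names are illustrative.

```python
def determine_path(source, destination, trace, config_next_hop, max_hops=32):
    """Build a network-layer path by mixing trace requests and config lookups.

    trace(node, dest): hypothetical callable returning the hops discovered
        after `node` toward `dest` (possibly partial, possibly empty).
    config_next_hop(node, dest): hypothetical callable returning the next hop
        taken from the node's configuration, or None if it is unavailable.
    Returns (path, complete), where `complete` is True if `dest` was reached.
    """
    path = [source]
    while path[-1] != destination and len(path) <= max_hops:
        segment = trace(path[-1], destination)
        if segment:
            path.extend(segment)  # portion determined by a trace request
            continue
        hop = config_next_hop(path[-1], destination)
        if hop is None or hop in path:
            break  # no further information, or a loop was detected
        path.append(hop)  # portion determined from configuration data
    return path, path[-1] == destination

# Hypothetical stand-ins for real trace and configuration access.
TRACES = {("S", "D"): ["E1"], ("E2", "D"): ["E3", "D"]}
CONFIG = {("E1", "D"): "E2"}

def stub_trace(node, dest):
    return TRACES.get((node, dest), [])

def stub_config(node, dest):
    return CONFIG.get((node, dest))

path, complete = determine_path("S", "D", stub_trace, stub_config)
```

Here the trace from S stalls at E1, the configuration of E1 supplies E2, and a trace from E2 completes the path. The sketch tries trace requests first and falls back to configuration data, but the order is a choice of this example only.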

If a network model 840 is available, the controller 850 may access the model 840 to determine the aforementioned configuration data, or, depending upon the level of detail in the model, to identify the data link layer devices that form the determined portions of the path.

The controller 850 may store the determined current path 830, to facilitate subsequent analyses, to facilitate subsequent path determinations, and to facilitate comparisons of paths over time. For example, if a subsequent problem is reported for the same pair of source-destination nodes, the controller 850 may access the prior path, then direct the monitor to report any differences in the configuration of devices along this prior path. If a routing table of one of the devices along this prior path has changed with regard to this source-destination pair, resulting in a diversion from the prior path, the path determination process need only start at this changed device, and need not re-determine the path from the source to this changed device. Additionally, as detailed above, determining where a path has changed may indicate a problem caused by a device that is no longer along the current path.
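The reuse of a stored prior path can be illustrated with the following sketch, which finds the portion of the prior path that is still valid and lists the devices that are no longer on the current path. The function names and the `current_next_hop` callable are hypothetical, introduced only for this example.

```python
def reusable_prefix(prior_path, current_next_hop):
    """Return the leading portion of prior_path that is still valid.

    current_next_hop(device): hypothetical callable giving the device's
    present next hop for this source-destination pair.
    The last device of the returned list is where re-determination starts.
    """
    for i in range(len(prior_path) - 1):
        if current_next_hop(prior_path[i]) != prior_path[i + 1]:
            return prior_path[: i + 1]  # diversion begins at prior_path[i]
    return list(prior_path)  # no diversion found along the prior path

def dropped_nodes(prior_path, current_path):
    """Return devices on the prior path that are absent from the current path."""
    current = set(current_path)
    return [node for node in prior_path if node not in current]

# Illustrative data: E1 now forwards to E4 instead of E2.
prior = ["S", "E1", "E2", "E3", "D"]
hops_now = {"S": "E1", "E1": "E4", "E4": "E3", "E3": "D"}
prefix = reusable_prefix(prior, hops_now.get)
dropped = dropped_nodes(prior, ["S", "E1", "E4", "E3", "D"])
```

With this data, path determination would restart at E1, the last device of `prefix`, and E2 would be a candidate for further examination as a device that is no longer along the current path.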

As also noted above, the controller 850 also directs the monitor M to collect performance metrics relative to one or more of the devices along the determined path. These performance metrics may include, for example, response time associated with identified hops along the path, interface utilization, interface errors, device CPU utilization, SLA data, and so on. The controller 850 may also access any of a variety of performance analysis tools 820 to perform, for example, statistical analysis, threshold comparisons, anomaly identification, and so on.
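A threshold comparison of the kind mentioned above might be sketched as follows. The metric names, the limits, and the dictionary layout are assumptions for illustration; the actual performance analysis tools 820 are not specified at this level of detail.

```python
def flag_threshold_violations(metrics, thresholds):
    """Return (device, metric, value, limit) for each value above its limit.

    metrics: {device: {metric_name: value}} collected along the path.
    thresholds: {metric_name: upper_limit}; metrics without a limit are skipped.
    """
    violations = []
    for device, values in metrics.items():
        for name, value in values.items():
            limit = thresholds.get(name)
            if limit is not None and value > limit:
                violations.append((device, name, value, limit))
    return violations

# Illustrative values only; not measured data.
collected = {
    "E1": {"cpu_utilization_pct": 35.0, "interface_errors": 0},
    "E2": {"cpu_utilization_pct": 97.5, "interface_errors": 12},
}
limits = {"cpu_utilization_pct": 90.0, "interface_errors": 10}
violations = flag_threshold_violations(collected, limits)
```

In this example only E2 is flagged, for both of its metrics, which could then be reported to the user or used to trigger an automated request.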

The controller 850 may also access routing tools 880, such as network simulators, protocol emulators, and so on, to derive alternative paths 890, as detailed above. Optionally, the user may define alternative paths 890. The controller may use one or more of the performance analysis tools 820 to compare the expected performance of the current path and these alternative paths, particularly if the system includes a network simulator that can be configured to provide expected performance metrics.

The controller 850 provides the results of the analysis of the performance metrics, as well as the metrics, to the user via the user interface 860. The controller 850 may also be configured to communicate alerts and warnings to remote systems.

The foregoing merely illustrates the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are thus within its spirit and scope. For example, FIG. 2 illustrates a flow in which the trace request is used initially, and the configuration data is used when the trace request is unable to identify the complete path between the source and destination nodes. One of skill in the art will recognize that the order and/or priority of applying the different path determining techniques are immaterial to the principles of this invention. For example, with reference to FIG. 1, if communications from either the source node S or the element E1 are controllable by the monitor M, the monitor M may access the configuration of element E1 to determine the next hop (E2) toward the destination, and then attempt to initiate a trace request from element E2. These and other system configuration and optimization features will be evident to one of ordinary skill in the art in view of this disclosure, and are included within the scope of the following claims.

In interpreting these claims, it should be understood that:

    a) the word “comprising” does not exclude the presence of other elements or acts than those listed in a given claim;
    b) the word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements;
    c) any reference signs in the claims do not limit their scope;
    d) several “means” may be represented by the same item or hardware or software implemented structure or function;
    e) each of the disclosed elements may be comprised of hardware portions (e.g., including discrete and integrated electronic circuitry), software portions (e.g., computer programming), and any combination thereof;
    f) hardware portions may include a processor, and software portions may be stored on a non-transient computer-readable medium, and may be configured to cause the processor to perform some or all of the functions of one or more of the disclosed elements;
    g) hardware portions may be comprised of one or both of analog and digital portions;
    h) any of the disclosed devices or portions thereof may be combined together or separated into further portions unless specifically stated otherwise;
    i) no specific sequence of acts is intended to be required unless specifically indicated; and
    j) the term “plurality of” an element includes two or more of the claimed element, and does not imply any particular range of number of elements; that is, a plurality of elements can be as few as two elements, and can include an immeasurable number of elements.

We claim:
 1. A non-transitory computer readable medium that includes instructions for causing a processor to: receive an identification of a source node and a destination node of a source-destination pair in a network, send a first trace request from a monitor node to the source node to determine a first path, find a controllable node on the first path, the controllable node being different from the source node and the monitor node, send a second trace request from the controllable node to the destination node to determine a second path, and combine at least a part of the first path to the second path to determine a current path between the source node and the destination node.
 2. The medium of claim 1, wherein the instructions cause the processor to find the controllable node by causing the processor to find the controllable node along the first path that is closest to the source node.
 3. The medium of claim 1, wherein the instructions cause the processor to determine the second path by causing the processor to determine at least a segment of the second path based on routing information associated with one or more elements along the second path.
 4. The medium of claim 3, wherein determining the segment includes accessing one or more routing tables within a model of the network.
 5. The medium of claim 1, wherein the instructions cause the processor to: collect performance data associated with one or more elements on the determined current path, process the performance data to provide performance information, and provide a display of the performance information.
 6. The medium of claim 5, wherein the instructions cause the processor to compare the determined current path to a prior path between the source and destination nodes, identify at least one other node based on a difference between the current path and the prior path, collect other performance data for the at least one other node, and display other information based on the other performance data.
 7. The medium of claim 6, wherein the instructions cause the processor to display a network diagram based on the difference.
 8. The medium of claim 5, wherein the instructions cause the processor to identify abnormal performance based on the performance data.
 9. The medium of claim 8, wherein the instructions cause the processor to display a network diagram based on the abnormal performance.
 10. The medium of claim 1, wherein the instructions cause the processor to receive a request for network information associated with the source node and the destination node, wherein the request includes access information associated with the controllable node, and sending the second trace includes gaining access to the controllable node based on the access information.
 11. The medium of claim 1, wherein the controllable node is a router.
 12. The medium of claim 1, wherein the instructions cause the processor to determine one or more segments of the second path by communicating tracing traffic from a node on the second path to the destination node, and noting the occurrence of this tracing traffic at one or more collection nodes.
 13. A monitoring system comprising: a controller that is configured to: receive an identification of a source node and a destination node of a source-destination pair in a network, send a first trace request from a monitor node to the source node to determine a first path, find a controllable node on the first path, the controllable node being different from the source node and the monitor node, send a second trace request from the controllable node to the destination node to determine a second path, and combine at least a part of the first path to the second path to determine a current path between the source node and the destination node; a memory that is configured to store the second path; and a display that is configured to display information regarding the second path.
 14. The system of claim 13, wherein the controller finds the controllable node by finding the controllable node along the first path that is closest to the source node.
 15. The system of claim 13, wherein the controller determines the second path by determining at least a segment of the second path based on routing information associated with one or more elements along the second path.
 16. The system of claim 13, wherein the controller: collects performance data associated with one or more elements on the determined current path, processes the performance data to provide performance information, and provides a display of the performance information.
 17. The system of claim 16, wherein the controller: identifies abnormal performance based on the performance data, and provides a display of a network diagram based on the abnormal performance.
 18. A method comprising: receiving an identification of a source node and a destination node of a source-destination pair in a network, sending, from a monitoring system, a first trace request from a monitor node to a source node of a network to determine a first path, finding, by the monitoring system, a controllable node on the first path, the controllable node being different from the source node and the monitor node, sending, by the monitoring system, a second trace request from the controllable node to a destination node of the network to determine a second path, and combining, by the monitoring system, at least a part of the first path to the second path to determine a current path between the source node and the destination node; storing, by the monitoring system, the second path to a memory at the monitoring system; and displaying, by the monitoring system, information regarding the second path.
 19. The method of claim 18, wherein determining the second path includes determining at least a segment of the second path based on routing information associated with one or more elements along the second path.
 20. The method of claim 18, including: collecting performance data associated with one or more elements on the determined current path, processing the performance data to provide performance information, and displaying the performance information.